ROCm and HIP: A Detailed 10-Chapter Tutorial: The Parallel Pivot: Mapping Sequential Logic to GPU Threads

The Parallel Pivot is a fundamental shift in computational philosophy: from temporal ordering (processing items one at a time) to spatial distribution (processing everything at once across the grid).

1. The Independence Heuristic

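This is the golden rule of GPU computing: "If your problem is 'apply some operation independently to each of N elements', this is the first mapping to try." This data-parallel approach is the low-hanging fruit of GPU acceleration, because the overhead of managing threads is negligible compared with the massive concurrency the hardware offers.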
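As a minimal sketch of the heuristic, the plain C++ below separates the per-element operation from the order in which elements are visited. It is a CPU-side illustration, not HIP runtime code, and the names `scale_element` and `apply_in_order` are hypothetical, chosen only for this example. Because each call touches only its own element, any visiting order produces the same result, which is the property that lets a GPU run all N calls at once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-element operation: it reads and writes only element i,
// so no call depends on the result of any other call.
inline void scale_element(std::vector<float>& data, std::size_t i, float factor) {
    assert(i < data.size());  // precondition: index must be in range
    data[i] *= factor;
}

// Applies the operation to every index listed in `order`, in that order.
// This is the sequential (temporal) version; on a GPU each index would be
// handed to its own thread instead of being visited by a loop.
inline std::vector<float> apply_in_order(std::vector<float> data,
                                         const std::vector<std::size_t>& order,
                                         float factor) {
    for (std::size_t i : order) {
        scale_element(data, i, factor);
    }
    return data;
}
```

Calling `apply_in_order` with the indices listed forwards and then backwards should give identical vectors. If the two results differed, the operation would carry a dependency between elements and this simple mapping would not apply.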

2. Precision and Data Payload

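HIP kernels typically operate on large arrays of primitive types. High-performance graphics and machine learning generally use float (single precision), while scientific simulations that require very high numerical stability use double (double precision).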
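The trade-off can be made concrete with a small host-side C++ sketch. It assumes IEEE 754 `float` and `double` (4 and 8 bytes), which holds on mainstream platforms but is not guaranteed by the C++ standard. The helper names `payload_bytes`, `add_one`, `float_drops_increment`, and `double_keeps_increment` are hypothetical. The example shows that double doubles the payload, and that float stops representing every integer at 2^24.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed to hold n elements of type T (the "data payload").
template <typename T>
std::size_t payload_bytes(std::size_t n) {
    // Guard against overflow of the byte count.
    assert(n <= static_cast<std::size_t>(-1) / sizeof(T));
    return n * sizeof(T);
}

// Hypothetical per-element operation, templated so the same body can be
// instantiated for float or for double.
template <typename T>
T add_one(T x) {
    T result = x + static_cast<T>(1);
    return result;
}

// 16777216 is 2^24. In float, 2^24 + 1 is not representable and rounds
// back to 2^24, so the increment is silently lost.
inline bool float_drops_increment() {
    return add_one(16777216.0f) == 16777216.0f;
}

// In double the same sum is exact.
inline bool double_keeps_increment() {
    return add_one(16777216.0) == 16777217.0;
}
```

Under the stated assumption, `payload_bytes<double>(n)` is twice `payload_bytes<float>(n)`, so choosing double buys precision at the cost of twice the memory traffic per element.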

3. From Iteration to Occupation

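In CPU code, the processor "visits" the data by way of a loop. In GPU logic, the data "occupies" the threads. You stop writing how to loop and instead start writing what a single worker should do at one specific coordinate.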

$$\text{index } i = \text{blockIdx.x} \times \text{blockDim.x} + \text{threadIdx.x}$$
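The index formula can be exercised on the CPU with a short emulation. The sketch below is plain C++, not HIP runtime code: `ThreadCoord` is a hypothetical stand-in for HIP's built-in `blockIdx.x`, `blockDim.x`, and `threadIdx.x`, and `global_index`, `blocks_for`, `add_kernel_body`, and `emulate_launch` are illustrative names. It also includes the bounds guard that becomes necessary when N is not a multiple of the block size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-ins for HIP's built-in blockIdx.x, blockDim.x and threadIdx.x.
// In a real HIP kernel these values are supplied by the runtime.
struct ThreadCoord {
    unsigned blockIdx_x;
    unsigned blockDim_x;
    unsigned threadIdx_x;
};

// The formula from the text: i = blockIdx.x * blockDim.x + threadIdx.x
inline std::size_t global_index(const ThreadCoord& c) {
    return static_cast<std::size_t>(c.blockIdx_x) * c.blockDim_x + c.threadIdx_x;
}

// Number of blocks so that blocks * block_size >= n (ceiling division).
inline unsigned blocks_for(std::size_t n, unsigned block_size) {
    assert(block_size > 0);
    return static_cast<unsigned>((n + block_size - 1) / block_size);
}

// Body of a hypothetical kernel: what ONE worker does at ONE coordinate.
inline void add_kernel_body(const ThreadCoord& c, const float* a, const float* b,
                            float* out, std::size_t n) {
    std::size_t i = global_index(c);
    if (i < n) {  // guard: the last block may extend past the end of the array
        out[i] = a[i] + b[i];
    }
}

// CPU emulation of a 1D launch: visits every (block, thread) coordinate.
// A GPU would run these bodies concurrently rather than in nested loops.
inline std::vector<float> emulate_launch(const std::vector<float>& a,
                                         const std::vector<float>& b,
                                         unsigned block_size) {
    assert(a.size() == b.size());
    std::vector<float> out(a.size(), 0.0f);
    unsigned blocks = blocks_for(a.size(), block_size);
    for (unsigned blk = 0; blk < blocks; ++blk) {
        for (unsigned t = 0; t < block_size; ++t) {
            add_kernel_body({blk, block_size, t}, a.data(), b.data(),
                            out.data(), a.size());
        }
    }
    return out;
}
```

In actual HIP code, the body would be a `__global__` function that reads the built-ins directly, and the nested loops in `emulate_launch` would be replaced by a kernel launch (for example via `hipLaunchKernelGGL`) with a grid of `blocks_for(n, block_size)` blocks. With 5 elements and a block size of 4, the emulation launches 2 blocks, and the guard keeps the 3 surplus threads of the second block from writing out of bounds.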

QUESTION 1

What is the primary heuristic for deciding if a problem is suitable for the 'Parallel Pivot'?

The problem requires complex recursion.

The problem involves applying an operation independently to N elements.

The problem must be solved in a strict temporal order.

The problem uses only integer arithmetic.

QUESTION 2

In the context of the Parallel Pivot, what does the term 'Occupation' refer to?

The CPU visiting each index in a for-loop.

How many blocks are currently queued in the GPU.

Data 'occupying' a specific thread at a specific coordinate.

The percentage of memory used by the float arrays.

QUESTION 3

Which primitive data types do HIP kernels most commonly handle, including for scientific work that needs high numerical stability?

bool and char

int and long

float and double

void and pointer

QUESTION 4

When pivoting a loop into a kernel, what replaces the loop counter `i`?

The return value of the function.

A global thread identity calculated from grid/block dimensions.

The hipMalloc address.

The host-side iteration variable.

QUESTION 5

Fill in the blank: To ensure production reliability even in basic kernels, you must ______.

Only use float types.

Add explicit error-checking macros everywhere.

Use a single thread per block.

Avoid all boundary checks.